Implementing Vocal Tract Length Normalization in the MLLR Framework

Authors

  • Guo-Hong Ding
  • Yi-Fei Zhu
  • Bo Xu
Abstract

Vocal Tract Length Normalization (VTLN) and Maximum Likelihood Linear Regression (MLLR) are two approaches to reducing the degradation in speech recognition performance caused by speaker variation. This paper derives a novel, efficient adaptation algorithm from the two techniques. Based on prior knowledge of conventional VTLN, an approximate linear transformation of constrained form is obtained. The transformation is learned using the EM algorithm and then applied in the MLLR setting. Experiments on three tasks are performed with an isolated word recognition system. The experimental results show that, with several adaptation words, the word error rate (WER) decreases substantially.
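The idea of learning a constrained linear transformation with EM and applying it to the model means, as in MLLR, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the one-parameter form A(alpha) = I + alpha * D, the fixed matrix `D`, the identity covariances, and the function names `estimate_alpha` and `adapt_means`. The paper's actual constrained form is not given in the abstract.

```python
import numpy as np

def estimate_alpha(frames, means, gammas, D):
    """One M-step for the scalar alpha in A(alpha) = I + alpha * D.

    frames: (T, d) adaptation feature vectors
    means:  (M, d) speaker-independent Gaussian means
    gammas: (T, M) occupation probabilities from the E-step
    D:      (d, d) fixed matrix (hypothetical, for illustration only)

    Assumes identity covariances, so the maximum-likelihood estimate
    reduces to a weighted least-squares solution in one unknown.
    """
    shifted = means @ D.T                          # (M, d): D @ mu_m for each mean
    resid = frames[:, None, :] - means[None, :, :] # (T, M, d): x_t - mu_m
    num = np.einsum("tm,tmd,md->", gammas, resid, shifted)
    den = np.einsum("tm,md,md->", gammas, shifted, shifted)
    return num / den

def adapt_means(means, alpha, D):
    """Apply the constrained transform to every mean: mu' = (I + alpha * D) mu."""
    A = np.eye(D.shape[0]) + alpha * D
    return means @ A.T

# Synthetic check: frames generated with a known alpha and hard alignments.
rng = np.random.default_rng(0)
means = rng.normal(size=(4, 3))
D = rng.normal(size=(3, 3))
labels = np.array([0, 1, 2, 3, 0, 1])              # which Gaussian emitted each frame
frames = adapt_means(means, 0.1, D)[labels]
gammas = np.eye(4)[labels]                         # one-hot occupation probabilities
alpha_hat = estimate_alpha(frames, means, gammas, D)
```

Because the transform has a single free parameter rather than a full matrix, very little adaptation data is needed to estimate it, which is consistent with the abstract's claim that several adaptation words suffice.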

Similar resources

An investigation into vocal tract length normalisation

This paper investigates several different methods for performing vocal tract length normalisation (VTLN) which are either completely linear or piece-wise linear. Furthermore, the combination of VTLN with either standard unconstrained maximum likelihood linear regression (MLLR) or constrained MLLR is considered. Results on the Switchboard corpus show that there is little difference in performance b...

Speaker normalization and speaker adaptation - a combination for conversational speech recognition

Speaker normalization and speaker adaptation are two strategies for handling variation due to speaker, channel, and environment. Vocal tract length normalization (VTLN) is an effective speaker normalization approach that compensates for variations in vocal tract shape. Maximum Likelihood Linear Regression (MLLR) is a recently proposed method for speaker adaptation. In this paper, we propo...

Investigation of Speaker-Clustered UBMs based on Vocal Tract Lengths and MLLR matrices for Speaker Verification

It is common to use a single, large, speaker-independent Gaussian Mixture Model based Universal Background Model (GMM-UBM) as the alternative hypothesis in speaker verification tasks. The speaker models are themselves derived from the UBM using the Maximum a Posteriori (MAP) adaptation technique. During verification, the log-likelihood ratio between the target model and the GMM-UBM is calculated to accep...

Experiments in speaker normalisation and adaptation for large vocabulary speech recognition

This paper examines techniques for speaker normalisation and adaptation that are applied in training with the aim of removing some of the variability from the speaker-independent models. Two techniques are examined: vocal tract normalisation (VTN), which estimates a single "vocal tract length" parameter for each speaker and then modifies the speech parameterisation accordingly, and speaker adaptiv...

Effects of Voice Therapy on Vocal Tract Discomfort in Muscle Tension Dysphonia

Introduction: Patients with muscle tension dysphonia (MTD) suffer from several physical discomforts in their vocal tract. However, few studies have examined the effects of voice therapy (VT) on the vocal tract discomfort (VTD) in patients with voice disorders. Therefore, the aim of the present study was to investigate the effects of VT on the VTD in patients with MTD.   Materi...

Publication date: 2002